ABSTRACT Most fungal genomes are poorly annotated, and many fungal traits of industrial and biomedical relevance are not well
suited to classical genetic screens. Assigning genes to phenotypes on a genomic scale thus remains an urgent need in the field. We 
developed an approach to infer gene function from expression profiles of wild fungal isolates, and we applied our strategy to the 
filamentous fungus Neurospora crassa. Using transcriptome measurements in 70 strains from two well-defined clades of this 
microbe, we first identified 2,247 cases in which the expression of an unannotated gene rose and fell across N. crassa strains in 
parallel with the expression of well-characterized genes. We then used image analysis of hyphal morphologies, quantitative 
growth assays, and expression profiling to test the functions of four genes predicted from our population analyses. The results 
revealed two factors that influenced regulation of metabolism of nonpreferred carbon and nitrogen sources, a gene that gov- 
erned hyphal architecture, and a gene that mediated amino acid starvation resistance. These findings validate the power of our 
population-transcriptomic approach for inference of novel gene function, and we suggest that this strategy will be of broad util- 
ity for genome-scale annotation in many fungal systems. 

IMPORTANCE Some fungal species cause deadly infections in humans or crop plants, and other fungi are workhorses of industrial 
chemistry, including the production of biofuels. Advances in medical and industrial mycology require an understanding of the 
genes that control fungal traits. We developed a method to infer functions of uncharacterized genes by observing correlated ex- 
pression of their mRNAs with those of known genes across wild fungal isolates. We applied this strategy to a filamentous fungus 
and predicted functions for thousands of unknown genes. In four cases, we experimentally validated the predictions from our 
method, discovering novel genes involved in the metabolism of nutrient sources relevant for biofuel production, as well as col- 
ony morphology and starvation resistance. Our strategy is straightforward, inexpensive, and applicable for predicting gene func- 
tion in many fungal species. 
Fungi are estimated to account for 25% of the world's biomass
(1) and to comprise as many as 5 million species (2). Almost all
fungal species can grow as filaments that invade the substrate as 
they feed. However, most of what we know about the genetic basis 
of fungal growth and the coordination of nutrient acquisition, 
transport, and metabolism has come from research on Saccharo- 
myces cerevisiae, a unicellular yeast that feeds by diffusion (3). 
Recently, this situation has changed due to the remarkable suit- 
ability of small (~30-Mb), haploid, easily cultivatable and essen- 
tially immortal filamentous fungi as subjects for whole-genome 
sequencing, such that more than one-third of all eukaryotic, 
whole-genome sequences are fungal (4). This sequencing effort 
has led to a rich collection of genomes of many filamentous fungi 
but one that is poorly annotated because filamentous fungi harbor 
thousands of genes absent from unicellular yeast (5). In the filamentous
fungus Neurospora crassa, a flagship model organism,
~40% of the predicted genes remain annotated with no known
function.

Traditionally, functions of uncharacterized genes have often 
been discovered in screens of deletion mutants engineered in an 
isogenic background (6, 7). A powerful complementary approach 
instead exploits the genetic changes that have arisen naturally in 
wild populations. When variation across outbred individuals af- 
fects the regulation of genes of common function (8), the biolog- 
ical role of an unannotated gene falling into such a regulon can be 
inferred by reference to the annotations of the rest of the group 
(9). Unlike S. cerevisiae, where a heterogeneous population struc- 
ture combined with the small number of available wild isolates has 
made it difficult to perform genome-wide association studies (10), 
N. crassa is particularly well suited for population analyses owing 
to the detailed understanding of population structure (11, 12) and
the large and growing culture collection of wild strains in this 
species (13-17). Here, we report on the use of expression as a
genome-scale screening tool in fewer than 150 wild individuals, 
far fewer than the >8,000 mutants of predicted nonessential genes 
in N. crassa that would be screened for phenotype in a library of 
deletion mutants. 

We set out to survey transcriptional variation in wild N. crassa, 
both intrapopulation differences between strains isolated in Lou- 
isiana (18) and differences between the Louisiana population and 
one from the southern Caribbean (11). From these analyses, we 
formulated hypotheses about genes of unknown function that 
might mediate growth, metabolic, and regulatory traits in 
N. crassa, and we focused on several as case studies for validation 
of our methods. Our experiments targeted the roles of these genes 
with respect to three fundamental aspects of filamentous fungal 
biology: colony development, regulation of iron acquisition, and 
global regulation of nitrogen and carbon metabolism. The results 
led to the discovery of novel phenotypes for four previously un- 
characterized genes. 

RESULTS 

Inferring multigene regulons from expression variation across 
an N. crassa population. To survey variation in N. crassa gene 
expression, we made use of our transcriptional profiles recently 
generated from wild isolates of this fungus collected in Louisiana 
(19). Of the 9,733 predicted N. crassa genes, 8,876 had mapped 
reads in at least 24 of the wild isolates, and we considered the latter 
set of genes to represent the core active transcriptional program of 
N. crassa under the standard growth conditions of our cultured 
colonies. 

To harness regulatory variation across strains to infer gene 
function, we first used our expression profiles to define coex- 
pressed gene clusters, applying a resampling strategy to assess the 
significance of cluster sizes (see Materials and Methods). At a cluster
size of nine genes, we identified 188 clusters whose gene expression
was correlated across wild strains with a correlation coefficient
of 0.4 or greater, whereas no such clusters were detected
in permuted data sets (see Table S1 in the supplemental material).
The majority of clusters (92%) contained at least three genes that 
have been annotated in functional categories according to the 
Functional Catalogue (FunCat) (20). In 72% of these clusters, we 
detected functional category enrichment at a P value of ≤0.05
(Benjamini-Hochberg-corrected hypergeometric test; see Data
Set S1), thereby highlighting the potential of our clustering data
set as a resource for the inference of function of uncharacterized 
genes. 
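The multiple-testing step named above can be illustrated with a minimal Python sketch of the Benjamini-Hochberg procedure. The function name and the raw P values are hypothetical and are not drawn from the data reported here; the sketch shows the standard procedure and is not necessarily the implementation used to produce Data Set S1.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg-adjusted P values in the original order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw enrichment P values for five functional categories.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in adj]
print(adj, significant)
```

In this toy input, two categories with nominal P values below 0.05 no longer pass the threshold after correction, which is the behavior the corrected test is meant to enforce.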

Molecular validation of a novel hyphal morphology gene. To
investigate, at the molecular level, the inferred function of uncharacterized
N. crassa genes that underlie growth traits, we first focused
on a coexpressed cluster of genes (cluster 48 in Data Set S1 in
the supplemental material; see also Table S2) encoding proteins
that (i) are localized to septa and cell walls (RHO4, COT2, and
ACW11), (ii) act as signaling molecules and transcription factors
(RHO1, ADA6, BEK2, and CHM1), and (iii) have suspected roles
in hyphal branching and septa formation (RHO4, RHO1, and
CHM1). Also among the genes in this cluster was NCU04826,
which encodes a hypothetical protein lacking any annotated func- 
tion or protein domains. Sequence searches revealed homologs of 
this gene in species of filamentous fungi within the Sordariomy- 
cetes and Leotiomycetes, with amino acid identity to the N. crassa 
sequence ranging from 28% (Glarea lozoyensis) to 97% (Neurospora
tetrasperma). Given the coregulation of NCU04826 with
known cell wall and morphology genes, we hypothesized that this
gene would have a role in hyphal morphology. To test this notion,
we obtained an N. crassa strain of the FGSC (Fungal Genetics
Stock Center) 2489 background harboring a deletion in
NCU04826 (21), crossed it to an isogenic wild-type strain, and
examined progeny for colony morphology. Strikingly, inheritance
of the deletion cassette conferred a distinct hyperbranching
phenotype, as predicted from our coexpression analysis (Fig. 1). In
assays of the branching phenotypes of strains bearing deletions of
the other genes from the NCU04826 coexpression cluster (see
Table S2), the colony and branching morphology of a strain
carrying a deletion of NCU02978 most closely resembled that of the
ΔNCU04826 strain (see Fig. S1). NCU02978 is predicted to
encode a protein similar to Sla1p in S. cerevisiae, an adaptor protein
for endocytosis involved in assembly of the actin cytoskeleton.
Consistent with a potential role for NCU04826 in the cytoskeleton,
we reevaluated the predicted protein product of NCU04826
and detected homology to PFAM "intermediate filament" and
"tropomyosin-like" protein families (PF00038 [E = 0.055] and
PF12718 [E = 0.0005], respectively), both of which comprise
cytoskeleton components. We conclude that NCU04826 represents
a previously uncharacterized determinant of colony morphology
and hyphal branching, possibly by affecting activity of the
cytoskeleton, underscoring the power of our coexpression clustering
methods to infer functions of unknown genes. We propose the
name hbc-1, hyperbranching and cytoskeleton 1, for this gene.

FIG 1 Deletion of the novel Neurospora crassa gene hbc-1 causes a hyperbranching
phenotype. Micrographs show branching morphologies of a colony
of the wild-type strain FGSC 2489 (A) or an isogenic strain bearing a deletion
in hbc-1 (NCU04826) (B). (C) Each column reports the distribution of a quantitative
measure of hyphal branching among progeny from a cross between the
wild-type and the Δhbc-1 (hbc-1::HYG) strain, with the thick black horizontal
bar reporting the median, 25% quartiles shown as a box, and whiskers extending
to 1.5 times the interquartile range. The x axis reports genotype across the
complete panel of cross progeny strains at the hbc-1 locus: HYG-, segregants
inheriting the wild-type hbc-1 gene; HYG+, segregants inheriting the hbc-1
deletion. The y axis reports node count, determined by counting the number of
branch junctions from the hyphal tip toward the colony center until a subtending
branch was reached that was itself branching. *, the difference between
node count in wild-type and hbc-1 mutant strains is significant at a Wilcoxon
P value of 0.008.

TABLE 1 Genes coregulated with asi-1 (NCU05257) across wild strains are affected by asi-1 deletion^a

ID         Annotation                        DE P value    Coexpression correlation
NCU00522   Cystathionine beta-lyase          2.3104e-05    0.78
NCU00535   Alanyl-tRNA synthetase            0.0054        0.84
NCU02019   FAD-dependent oxidoreductase      1.5469e-08    0.82
NCU02543   Aspartate aminotransferase        0.0027        0.92
NCU05045   MFS monocarboxylate transporter   1.8375e-09    0.90
NCU05256   Hypothetical                      0.0013        0.81
NCU07126   Acetyltransferase (GNAT) family   0.0503        0.83
NCU11365   Aminotransferase                  0.0533        0.82

^a Each row shows analysis of one gene whose expression changed significantly upon deletion of asi-1 and was tightly correlated with expression of asi-1 across Louisiana strains of
N. crassa. DE P value, Benjamini-Hochberg-corrected significance of differential expression between an engineered asi-1 deletion mutant and an isogenic wild-type strain; the
complete set of genes responsive to asi-1 deletion is given in Data Set S2 in the supplemental material. Coexpression correlation, Spearman's rank correlation coefficient between
the expression levels of the indicated gene and asi-1 across Louisiana isolates; the complete set of genes coregulated with asi-1 is given in Table S2. ID, identifier.

Molecular validation of a novel amino acid metabolism gene. 
We next set out to use our population genomics approach to in- 
vestigate the regulatory functions of transcription factors. The 
majority of annotated transcription factors in N. crassa have no 
known function or target gene set (6), and we reasoned that our 
clusters of genes defined by coregulation across wild isolates could 
be harnessed to infer the pathways in which these transcription 
factors act. As a positive control for this approach, we first exam- 
ined the 35 N. crassa transcription factors of known function (22) 
that fell into coexpression clusters enriched for one or more FunCat
terms: of these, we identified 15 whose FunCat annotation
overlapped with that of genes in their respective clusters (see Data
Set S1 in the supplemental material; permutation P < 0.0001). For
example, the known carbon catabolite repressor gene cre-1 was
annotated in carbohydrate metabolism, as were the genes with
which its expression was correlated (see cluster 176 in Data
Set S1).

To evaluate this strategy in the context of an unannotated tran- 
scription factor, we focused on the gene NCU05257, which en- 
codes a predicted zinc finger and homeobox DNA-binding pro- 
tein and which fell into an expression cluster containing 58 other 
genes in our analysis of expression among wild N. crassa strains 
(cluster 15 in Data Set S1 in the supplemental material). This
group was enriched for genes annotated in amino acid metabolism
(see Data Set S1 and Table S2), and NCU05257 was previously
reported to be a putative target of the N. crassa amino acid bio- 
synthesis regulator CPC1 (23). To test the regulatory impact of 
NCU05257 directly, we used transcriptome sequencing (RNA- 
seq) to generate the transcriptional profile of a strain bearing a 
deletion in NCU05257, and we compared this profile to that of its 
isogenic wild-type control, identifying 43 genes affected by the 
NCU05257 mutation at a lenient statistical cutoff (Benjamini- 
Hochberg-corrected P value < 0.1; see Data Set S2). This expres- 
sion signature was enriched for genes annotated in amino acid 
metabolism (see Table S3), a conclusion that was independent of 
the threshold used to call differential expression (data not shown). 
Likewise, the NCU05257 deletion signature was enriched for the 
members of the cluster of genes with which it was coregulated 
across wild strains (eight genes present in both sets; Table 1; Fisher's
exact test, P = 9.76 × 10^-9), and in a quantitative analysis, this
cluster was enriched for dramatic expression change upon
NCU05257 deletion (resampling P = 0.007). Interestingly, in the
NCU05257 mutant, we also noted upregulation of several iron
scavenging genes (the aerobactin siderophore biosynthesis protein
IUCB, the fatty acid coenzyme A [CoA] ligase NCU06063,
which is involved in siderophore biosynthesis [24], and the siderophore
iron transporter NCU06132; see Data Set S2).
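The overlap between a deletion signature and a coexpression cluster can be assessed with an upper-tail hypergeometric probability, which is equivalent to a one-tailed Fisher's exact test on the corresponding 2 × 2 table. The Python sketch below is a minimal illustration. The function name is hypothetical, and the background of 8,876 expressed genes and the cluster size of 59 are assumptions made for the example, so the result is not expected to reproduce the P value reported above.

```python
from math import comb


def overlap_pvalue(universe, set_a, set_b, overlap):
    """P(X >= overlap) when set_b genes are drawn at random from the universe,
    of which set_a belong to the category of interest."""
    total = comb(universe, set_b)
    upper = min(set_a, set_b)
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Assumed inputs: 8,876 background genes, a 59-gene cluster,
# a 43-gene deletion signature, and 8 genes shared by both sets.
p = overlap_pvalue(universe=8876, set_a=59, set_b=43, overlap=8)
print(p)
```

The choice of background matters: a smaller universe of testable genes raises the expected overlap and therefore the P value.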

To investigate the phenotypic role of NCU05257, we used the 
histidine biosynthesis inhibitor 3-amino-1,2,4-triazole (3-AT),
which induces growth defects in N. crassa amino acid biosynthesis 
regulator mutants (25, 26). We crossed the NCU05257 deletion 
strain to an isogenic wild-type strain and measured the growth 
rate of progeny in the presence of 3-AT. In this cross, strains inheriting
ΔNCU05257 showed significantly compromised growth
in 3-AT relative to that of wild-type progeny (Fig. 2), paralleling 
the behavior of a cpc-1 deletion strain (see Fig. S2 in the supple- 
mental material) and confirming the importance of NCU05257 in 
the cellular response to amino acid starvation. We conclude that 
NCU05257 is a novel component of the amino acid metabolic 
control network in N. crassa with a link to iron scavenging, and we 
propose the name asi-1, for amino acid and siderophore regulation,
for this gene.

FIG 2 Deletion of asi-1 increases sensitivity to the histidine analogue 3-AT.
Each column reports the distribution of growth rate among progeny from a
cross between the wild-type strain, FGSC 4200, and an isogenic strain bearing
a deletion in asi-1 (NCU05257; asi-1::HYG), with symbols as in Fig. 1 except
that points >1.5-fold outside the interquartile range are shown as circles. The
x axis reports genotype in cross progeny at the asi-1 locus: HYG-, segregants
inheriting the wild-type asi-1 gene; HYG+, segregants inheriting the asi-1
deletion. The y axis reports the ratio between growth rate on Vogel's minimal
medium (VMM) containing 6 mM 3-amino-1,2,4-triazole (3-AT) and growth
rate on VMM lacking 3-AT. *, the difference between the growth rate on 3-AT
in the wild-type and Δasi-1 mutant strains is significant at a Wilcoxon P value
of 0.008.

Regulatory variation between N. crassa populations and the 
nitrogen metabolite repression pathway. Having explored the 
genetic differences among Louisiana strains of N. crassa, we next 
investigated the variation between two populations by incorpo- 
rating transcriptional profiles previously reported for a southern 
Caribbean N. crassa population (11). To polarize changes with the
use of an outgroup, we also added profiles of an N. crassa popula- 
tion from Panama (11). Genome-scale comparisons between Ca- 
ribbean and Louisiana strains detected significant differential ex- 
pression of 1,539 genes (Benjamini-Hochberg correction, P < 
0.05; see Data Set S3 in the supplemental material). 

To use expression divergence between populations as a test bed 
for inference of gene function, we began with the genes most 
strongly differentially expressed between Louisiana and Carib- 
bean strains. At the top of the list was allantoicase (alc-1) (Fig. 3), 
a purine degradation gene canonically studied as part of the nitro- 
gen metabolite repression (NMR) program, for its repression in 
the presence of preferred nitrogen sources (27, 28). Ammonium, 
at the concentrations used in our transcriptional profiling exper- 
iments, is sufficient to trigger NMR in N. crassa (27). This fact 
made it all the more striking that alc-1, a gene that ought to be 
under NMR, was highly expressed among wild Louisiana isolates 
of N. crassa (Fig. 3). This finding of unusual expression of an NMR 
target gene led us to further investigate interpopulation differ- 
ences in metabolic repression programs. To analyze as many NMR 
targets as possible, we began with the set of NMR targets previ- 
ously characterized in Neurospora, and we added to it Neurospora 
genes detected by a best-reciprocal-BLAST search of the N. crassa 
genome using known NMR targets in other fungi (see Data Set S4 
in the supplemental material). Of these, we examined the 15 genes 
whose expression differed between the Louisiana and Caribbean 
populations and found 9 that were upregulated in Louisiana indi- 
viduals relative to those of the Caribbean and Panamanian popu- 
lations (Fig. 3, yellow-shaded panels). Finding such a large num- 
ber of genes differentially upregulated in the same population 
constitutes a directional coherence significantly greater than that 
expected by chance from the genome as a whole (one-tailed Fish- 
er's exact test, P = 0.005). Thus, Louisiana isolates of N. crassa 
exhibited a program of high expression of NMR targets in rich 
medium that was unique among our study populations. 

PP4 controls expression of nitrogen metabolite repression 
and carbon catabolite repression targets. The NMR target genes 
differentially expressed between the Louisiana and Caribbean 
N. crassa clades also exhibited expression-level variation within 
the Louisiana population (Fig. 3). To complement our interpop- 
ulation comparison, we took a candidate-gene approach in hopes 
of identifying master regulators of metabolite expression pro- 
grams, as follows. We identified N. crassa orthologs of the set of 
known NMR regulators in S. cerevisiae (29) (see Data Set S4 in the 
supplemental material) and examined their coding regions for 
sequence variation among Louisiana N. crassa strains. Among 
these genes, the only case of striking sequence difference between 
strains was in the putative protein phosphatase PP4 (NCU08301),
a regulator of the circadian oscillator FRQ in N. crassa (30) whose 
homolog in S. cerevisiae activates the NMR transcription factor 
Gln3 (31); Louisiana individuals bore two derived amino acid 
variants in PP4 at high frequency relative to Caribbean strains 
(Fig. 4). To investigate the regulatory impact of pp4, we therefore
transcriptionally profiled a strain harboring a deletion in this gene
alongside an isogenic wild-type strain. Comparison between the
two expression profiles identified 195 differentially expressed
genes (Benjamini-Hochberg-corrected P value < 0.1; see Data
Set S2), of which nearly all (187 genes) showed elevated expression
upon pp4 deletion. The response to the deletion of pp4 was signif- 
icantly more dramatic among NMR targets than expected under a 
null hypothesis based on genomic resampling (P < 0.0001), con- 
sistent with a role for PP4 in the repression of NMR target genes. 
In addition, we noted in the pp4 deletion signature a number of 
genes involved in the metabolism of alternative carbon sources, 
including the mannose metabolism genes NCU07067, 
NCU02322, NCU07269, and NCU07318 and the polysaccharide 
metabolism genes NCU08755, NCU04959, NCU09281, and 
NCU04431 (see Data Set S2). We thus suspected that in addition 
to the NMR genes we had originally analyzed (Fig. 3, yellow- 
shaded panels), PP4 also functioned as a repressor of genes subject 
to carbon catabolite repression (CCR) in glucose medium, which 
we call CCR targets. As an unbiased test for the role of pp4 in 
carbon catabolite repression, we considered the 75 genes whose 
expression increases upon deletion of the N. crassa carbon catabolite
repressor CreA/CRE1 (32). Eight of these CCR (CRE1) targets
were differentially expressed in the pp4 deletion, an overlap
beyond that expected by chance (hypergeometric P = 0.0001). 
Furthermore, seven of the eight targets shared by CRE1 and PP4 
showed elevated expression in the pp4 deletion strain, further sup- 
porting a model of PP4 as a repressor of CCR targets. We thus 
conclude that PP4 functions in the joint regulatory control of 
nitrogen and carbon metabolism genes in N. crassa. 
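A resampling test of the kind invoked above for NMR targets can be sketched as follows. This is a minimal Python illustration under stated assumptions: the test statistic (mean absolute log fold change), the function name, the number of draws, and the toy data are all hypothetical, because the details of the genomic resampling are given in Materials and Methods and are not reproduced in this section.

```python
import random


def resampling_pvalue(scores, target_genes, n_draws=10000, seed=0):
    """Fraction of random gene sets whose mean |score| is at least as large
    as that of the target set (add-one corrected)."""
    rng = random.Random(seed)
    genes = list(scores)
    k = len(target_genes)
    observed = sum(abs(scores[g]) for g in target_genes) / k
    hits = 0
    for _ in range(n_draws):
        # Draw a random set of the same size from the whole genome.
        sample = rng.sample(genes, k)
        if sum(abs(scores[g]) for g in sample) / k >= observed:
            hits += 1
    # The add-one correction keeps the estimate from being exactly zero.
    return (hits + 1) / (n_draws + 1)


# Toy data: 200 genes with small log fold changes, 5 targets with large ones.
scores = {f"gene{i}": ((i % 7) - 3) * 0.1 for i in range(200)}
targets = [f"gene{i}" for i in range(5)]
for g in targets:
    scores[g] = 2.0

p = resampling_pvalue(scores, targets)
print(p)
```

Because the null sets are drawn from the same genome as the targets, the test asks whether the targets respond more strongly than typical genes, without assuming any particular distribution of fold changes.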

We next hypothesized that if the naturally occurring sequence 
variants among Louisiana strains in pp4 affected the function of 
this gene, inheritance at pp4 would be correlated with expression 
of nitrogen and carbon metabolism pathways across the Louisiana 
population. To test this notion, we examined both our directly 
inferred NMR targets and a broader curated set of nitrogen me- 
tabolism genes inferred from the nitrogen starvation transcrip- 
tional response of Magnaporthe grisea (33) (see Data Set S4 in the 
supplemental material; see also Materials and Methods) for asso- 
ciation of their expression with inheritance at pp4 across Louisi- 
ana strains. We likewise tested for association between pp4 geno- 
type and expression of CCR targets, again using the CRE1 target 
gene set as a reflection of the latter (32). Unexpectedly, the results 
were not consonant with natural variation in the pp4 sequence 
having a major impact on gene expression: expression of only one 
gene, NCU00789, showed modest association with the pp4 geno- 
type across Louisiana strains (Benjamini-Hochberg-corrected P 
value = 0.12; Fig. 4). Thus, despite the dramatic impact of an 
engineered deletion in pp4, we concluded that natural polymorphisms
in pp4 did not underlie most of the variation in metabolite
repression genes within or between wild N. crassa populations.
However, given that our exploration of pp4 led us to the
uncharacterized gene NCU00789, we considered the latter in its 
own right as another candidate gene for inference of novel func- 
tion in N. crassa. 

Nitrogen and carbon metabolism gene expression associates 
with Nc_nmr6. GenBank lists the protein encoded by NCU00789 
(accession no. AY935520.1) as the N. crassa ortholog of the 
Hansenula polymorpha gene NMR6, an unpublished, 12-transmembrane
ammonium sensor, and we refer to the Neurospora
version of this gene as Nc_nmr6. The Nc_nmr6 locus harbored
extensive variation at coding and silent sites across N. crassa
isolates (Fig. 5A), which defined two haplotypes. One Nc_nmr6
haplotype was borne by most strains of the Caribbean population
and a few strains from Louisiana, and a second Nc_nmr6 haplotype
was apparent in the majority of Louisiana isolates. We thus
hypothesized that Nc_nmr6 was a previously uncharacterized
component of the nitrogen and carbon metabolism gene network
in N. crassa and that polymorphisms at Nc_nmr6 served to tune
the expression of its target genes even in medium containing preferred
carbon and nitrogen sources.

FIG 3 Targets of carbon catabolite and nitrogen metabolite repression are upregulated in the N. crassa Louisiana population. Each panel shows expression of
one target gene of the nitrogen metabolite repression program (yellow shading; see Data Set S4 in the supplemental material) or of the carbon catabolite
repression regulator CRE1 (orange shading [32]), measured in wild strains of N. crassa grown on standard rich medium. In a given panel, each column reports
the distribution of expression of the indicated gene across the strains of one N. crassa population: LA, Louisiana; CARIB, Caribbean; PAN, Panama. Symbols are
as in Fig. 2. NMR, nitrogen metabolite repression targets; CCR, carbon catabolite repression targets. Systematic gene identifiers, from top left to bottom right, are
NCU01066, NCU01816, NCU02333, NCU03076, NCU03257, NCU05387, NCU07675, NCU08356, NCU10007, NCU00721, NCU00130, NCU01140,
NCU01449, NCU02904, NCU03151, NCU04039, NCU04197, NCU04460, NCU04963, NCU08746, NCU10021, and NCU07363.

FIG 4 A derived, high-frequency nonsynonymous variant in pp4 is associated with the expression level of a novel putative ammonium sensor. (A) The top panel
shows an amino acid alignment of a region of the putative protein phosphatase PP4, in which the major allele among Louisiana strains of N. crassa (N_crassa_LA)
encodes a glycine at residue 177, whereas the major allele among Caribbean isolates (N_crassa_CARIB) and the sequence in other Sordariomycetes encodes an
aspartate. The inset shows allele frequencies of the aspartate and glycine alleles in the Louisiana and Caribbean N. crassa populations. (B) Each column reports
the expression of Nc_nmr6 (NCU00789) across Louisiana N. crassa strains harboring the indicated allele at the PP4 variant in panel A. The Nc_nmr6 expression
level associated at modest significance with the pp4 genotype across Louisiana strains (nominal P = 0.0001; Benjamini-Hochberg-corrected P = 0.12). Symbols
are as in Fig. 2.

To validate our inference of Nc_nmr6 as a determinant of met- 
abolic gene expression, we first investigated the relationship be- 
tween inheritance at Nc_nmr6 and NMR target gene expression 
across Louisiana individuals. The results revealed a robust associ- 
ation signal, with the major haplotype of Nc_nmr6 associated with 
increased NMR target gene expression among Louisiana strains 
(one-tailed paired Wilcoxon P value = 0.006). We next evaluated 
the effect of the Nc_nmr6 genotype on the broader set of nitrogen 
starvation genes inferred from profiles of M. grisea (33). Again we 
observed a striking relationship (Fig. 5B and C): among the 62 
genes for which the major allele of Nc_nmr6 among Louisiana 
individuals was associated with lower expression, largely compo- 
nents of the translation machinery, 50 were downregulated during 
nitrogen starvation (Fig. 5B) (binomial P = 9.204e-05). Likewise,
of the 37 genes activated by the major allele of Nc_nmr6, 31
were upregulated during nitrogen starvation (Fig. 5C; binomial P
= 8.197e-07). The latter included genes involved in the metabolism
of alternative carbon sources (e.g., the xylanase NCU08189,
the gluconate reductase NCU09519, and the rhamnose synthase 
NCU10683), as well as nitrogen metabolism genes (e.g., the 
uricase NCU07853 and the pyrimidine catabolism gene hydantoi- 
nase NCU00689). We thus suspected that Nc_NMR6 had a role in 
the expression of CCR targets as well as NMR targets. To evaluate 
the impact of Nc_nmr6 on carbon catabolite repression, we exam- 
ined the association between the Nc_nmr6 genotype and expres- 
sion of CRE1 target genes (32). Of the latter, 18 were differentially 
expressed between our Louisiana and Caribbean N. crassa popu- 
lations, with 14 exhibiting a derived program of increased expres- 
sion in Louisiana individuals relative to outgroups (one-tailed 
Fisher's exact P value = 0.005; Fig. 3, orange-shaded panels). As 
predicted, among Louisiana individuals, the major Nc_nmr6 hap- 
lotype was associated with high expression of CRE1 targets (one- 
tailed paired Wilcoxon P value = 0.005). We conclude that the 
major allele of Nc_nmr6 among Louisiana strains is associated 
with an expression program in rich medium that mirrors the re- 
sponse to nitrogen starvation and the loss of carbon catabolite 
repression, strongly suggesting that Nc_NMR6, like PP4, func- 
tions in the regulation of genes that metabolize nonpreferred nu- 
trient sources. 
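The directional agreement counts reported above (50 of 62 genes and 31 of 37 genes) lend themselves to a binomial test, sketched here in Python. The function name is hypothetical, and the null proportion of 0.5 and the one-sided alternative are assumptions; this section does not state the null model used, so the values printed by the sketch need not match the reported binomial P values.

```python
from math import comb


def binomial_upper_tail(successes, trials, p_null=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Counts taken from the text; the null proportion of 0.5 is an assumption.
p_down = binomial_upper_tail(50, 62)  # genes lower with the major allele
p_up = binomial_upper_tail(31, 37)    # genes higher with the major allele
print(p_down, p_up)
```

A null proportion other than 0.5 would be appropriate if, for example, most genes in the reference data set moved in one direction during nitrogen starvation.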

Nc_nmr6 is a novel determinant of metabolism gene expres- 
sion. As a direct test of the regulatory impact of Nc_nmr6, we 
constructed a strain bearing a deletion of this gene (see Materials 
and Methods), using as the genetic background FGSC 2489, which 
bears the major allele from the Louisiana population. We transcriptionally
profiled this ΔNc_nmr6 strain and its wild-type isogenic
progenitor after growth in rich medium, finding 80 genes
that were differentially expressed between the strains (Benjamini-Hochberg-corrected
P value < 0.1; see Data Set S2 in the supplemental
material), enriched for annotation in a number of functions,
including carbohydrate metabolism (see Table S3). Analysis
of the Nc_nmr6 deletion signature bore out our prediction of a
role for Nc_nmr6 in nitrogen metabolism regulation: the suite of
NMR targets that we had originally ascertained based on upregulation
in Louisiana strains relative to levels in other populations
(Fig. 3, yellow-shaded panels) were expressed at low levels in the
ΔNc_nmr6 mutant (resampling P value = 0.01; Fig. 6). Likewise,
again conforming to our expectation, the engineered ΔNc_nmr6
mutant expressed CRE1 targets at lower levels than the wild-type
control (Fig. 6). We conclude that Nc_NMR6 functions in the
regulation of NMR and CCR gene targets and, in contrast to the
behavior of the putative repressor PP4, is required for the high
expression of these genes by the Louisiana strain FGSC 2489 in
rich medium. Given the impact of both PP4 and Nc_NMR6 on
multiple nutrient response pathways, our data thus implicate
these proteins as two novel control points for expression of metabolite
repression programs in N. crassa.

FIG 5 A derived haplotype at the Nc_nmr6 locus (NCU00789) is associated with a gene expression program that mirrors the response to nitrogen starvation.
(A) Alignment and insets are as in Fig. 4A, except that a polymorphic region of Nc_nmr6 is shown. (B and C) In a given panel, each row reports expression of one
gene for which expression of the N. crassa ortholog was significantly associated with the genotype at Nc_nmr6 across Louisiana strains (nominal P < 0.05; see
Materials and Methods for association test details), and expression of the M. grisea ortholog changed >2-fold upon nitrogen starvation in a laboratory strain (33).
For a given row, the left column reports the log2 of the average expression level of the respective gene among Louisiana strains bearing the derived haplotype of
Nc_nmr6, relative to the analogous average across strains with the ancestral haplotype; the right column reports the log2 of the expression of the respective gene
in nitrogen-starved M. grisea, relative to the analogous measurement in untreated cells. (B) Genes for which the derived haplotype of Nc_nmr6 was associated
with high expression relative to the ancestral haplotype. (C) Genes for which the derived haplotype of Nc_nmr6 was associated with low expression relative to the
ancestral haplotype. For ease of visualization, all association effect sizes in panels B and C were normalized by a multiplicative factor of 5.

DISCUSSION 

Transcripts that are up- and downregulated together across
conditions may often encode proteins of similar functions. This idea
has motivated studies of coregulation between unknown and
well-characterized genes (34, 35), to shed light on novel, species- or
condition-specific gene functions. A key roadblock in the field is
that such highly specialized genes may be inactive under the
experimental conditions used for standard analyses of gene
coregulation, even in microbes (35-37). In contrast, when natural
genetic variation impacts the expression of a wide range of genes, it
can enable a broader survey of gene function than almost any
other experimental design (36). To date, schemes using wild
populations to predict gene function have been almost untested in the
literature (9), and experimental follow-up of predictions from
coexpression analyses has been at a premium. In this work, we
established a pair of wild populations of N. crassa as a model
system for the inference of gene function from expression profiles.
This approach resulted in a diverse set of functional annotations
enriched across 143 multigene coexpression clusters. Using the
power of molecular genetics with N. crassa, we then
experimentally validated the roles of unannotated genes from these clusters
in filamentous fungus traits.

FIG 6 Nc_nmr6 (NCU00789) controls expression of target genes of the carbon catabolite and nitrogen metabolite repression programs. Each column reports
the distribution, across the genes of the indicated set, of the change in expression between a strain harboring an engineered deletion in Nc_nmr6 (KO) and an
isogenic wild-type (WT) strain. All genes, the complete set of genes expressed in the knockout experiment; DE between populations, the set of genes significantly
differentially expressed between Louisiana and Caribbean N. crassa strains (see Data Set S3 in the supplemental material); NMR genes upregulated in LA, the set
of inferred targets of the nitrogen metabolite repression program upregulated in Louisiana strains relative to other N. crassa populations (see yellow panels in
Fig. 3); CCR genes upregulated in LA, the set of genes repressed by the carbon catabolite repression regulator CRE1 and upregulated in Louisiana strains relative
to other N. crassa populations (see orange panels in Fig. 3). For the last three columns, the text at the top reports the resampling-based estimate of significance
of the difference between the indicated regulon and the set of all genes with respect to the expression effects of Nc_nmr6 deletion. Symbols are as in Fig. 2, except
that for ease of visualization, data points falling outside the box plot whiskers are not shown.

Hyphal branching plays a critical role in the interaction be- 
tween a fungus and itself (formation of mycelial colonies), other 
organisms (nonself recognition), and its growth substrate (forag- 
ing) (38). Our analysis of the novel hyphal branching factor 
NCU04826 focused on a cluster of N. crassa genes whose products 
localize to cell or hyphal peripheries or have known functions as 
regulators of cell polarity and hyphal growth. The pattern of co- 
expression across strains indicates that expression of these up- 
stream regulators is itself under joint control in the regulatory 
network, suggestive of feedback. Our results implicate 
NCU04826, which we named hbc-1, for hyphal branching and 
cytoskeleton, as a novel determinant of the polar growth machin- 
ery. The hyperbranching phenotype of the NCU04826 deletion 
strain resembles the effects of mutations in the cytoskeleton as- 
sembly control protein NCU02978 and those in the cot-1 and 
pod-6 kinases, which block hyphal extension and force lateral 
rather than directional growth (39, 40). In light of the apparent
absence of NCU04826 homologs in unicellular yeasts, this gene
likely contributes to the distinct biology of polar growth in the
filamentous fungi (41), highlighting the utility of dedicated
annotation efforts for these species.

Functional inference methods are in urgent demand for tran- 
scription factors, whose biological roles often remain unknown 
even in exhaustively studied model organisms like budding yeast 
(37). Our case study for characterization of a novel transcription 
factor was a coexpression cluster enriched for amino acid metab- 
olism genes that contained NCU05257, a gene encoding a protein 
with predicted zinc finger and homeobox DNA-binding domains. 
Of the genes whose expression dropped when NCU05257 was 
deleted, many overlapped with the original coexpression gene 
cluster and/or were implicated in amino acid metabolism and iron 
scavenging via siderophores. Challenging the ΔNCU05257 strain
with an amino acid biosynthesis inhibitor demonstrated the
importance of this gene in the response to amino acid starvation. We
named the NCU05257 gene asi-1, for amino acid and siderophore
regulation; our results provide a first window onto the potential
function of this gene in coordinating the joint regulation of fungal
siderophore biosynthesis with amino acid supply (42, 43).

Besides analyzing expression variation across Louisiana strains 
of N. crassa, we also predicted gene function using the differences 
between two populations of this fungus, isolates from Louisiana 
and the southern Caribbean. The gene with the strongest inter- 
population transcription difference, that encoding allantoicase, is 
part of the NMR program. This observation led us to a focus on 
patterns of coordinated expression between nitrogen and carbon 
metabolism genes and the discovery of a role in these metabolic 
networks for two putative upstream signaling factors. Our exper- 
iments implicated the putative protein phosphatase gene pp4 in 
the repression of genes involved in the metabolism of both
nonpreferred nitrogen and carbon sources. In S. cerevisiae, where
nutrient sensing networks have been well delineated, no protein
phosphatase is known to participate in both carbon catabolite and 
nitrogen metabolite repression (44); it is thus tempting to specu- 
late that the division of labor among protein phosphatases in nu- 
trient sensing pathways has diverged between yeast and N. crassa. 
Likewise, given that PP4 has been well characterized in N. crassa as 
a regulator of the circadian clock machinery (30), we hypothesize 
that this protein may function as a novel component of the re- 
cently discovered gene network for circadian control of metabo- 
lism in N. crassa (45), plausibly linking the circadian clock with 
nitrogen and carbon scavenging under nutrient-poor conditions. 

Our analysis of the putative transmembrane protein
Nc_NMR6 identified a haplotype in Louisiana strains associated with
high expression of genes involved in metabolism of nonpreferred 
carbon and nitrogen sources, even during growth in glucose- and 
ammonium-containing medium. The reduced expression of these 
nutrient-scavenging genes that we observed in an Nc_nmr6 dele- 
tion strain is consistent with either of two possible models for the 
function of the encoded protein. On the one hand, the ancestral 
haplotype of Nc_NMR6 could transduce a positive signal of the
presence of preferred carbon and nitrogen sources, analogous to 
well-characterized glucose and ammonium receptors in many 
fungi (46-48). Under this model, the major allele of Nc_nmr6 in 
the Louisiana population would act as a dominant negative loss of 
function, failing to signal the availability of preferred nutrient 
sources in rich medium and leading to derepression of NMR and 
CCR targets. Upon deletion of this allele of Nc_nmr6 in the Lou- 
isiana genome, metabolite repression would be reinstated by 
CRE1, NMR1, and other NMR and CCR regulators in N. crassa 
(49, 50). Alternatively, the ancestral form of Nc_NMR6 could ac- 
tively transduce a signal in the absence of preferred nutrient 
sources, activating the metabolic machinery for nutrient scaveng- 
ing; the major haplotype in the Louisiana population would then 
act as a gain of function to elevate expression of NMR and CCR 
targets even in rich medium, and deletion of this allele would 
abrogate the expression effect. In either case, our results make 
clear that Nc_NMR6, like PP4, functions as part of a complex
network for the joint control of carbon and nitrogen metabolism. 
The regulatory impact of both genes dovetails with that of the 
sugar sensor TPS1 in Magnaporthe oryzae, which signals to regu- 
lators of genes for metabolism of nonpreferred nitrogen and car- 
bon sources (51-53). And our de novo inference of the functions of 
these previously uncharacterized genes underscores the power of 
the comparative transcriptomic approach for gene discovery. 

Coexpression analysis can be a powerful approach for the pre- 
diction of gene function, though to date, relatively few genome- 
scale studies have experimentally validated the in silico inference 
of a gene's role at the organismal level (54-59). By validating the 
regulatory and organismal impact of four novel genes, our work 
makes clear that expression differences between wild isolates can 
lead to biologically meaningful inferences of gene function, 
whether the loci of interest are involved in gross morphology or 
the regulation of metabolism. The coexpression clusters we have 
reported here contain 2,243 other unannotated genes well suited 
for a similar functional discovery pipeline, in which the deletion 
strain of any such gene can be assayed for traits in which its
coregulated partners are known to play a role. We hope that by
demonstrating the approach, our work will stimulate other
researchers to use the collection of Louisiana and Caribbean
N. crassa strains that have been transcriptionally profiled to
determine functions for unknown genes in the pathways that interest
them. Given the homology among genes in filamentous
Ascomycota, the N. crassa collection will be of immediate use in inference
of gene function in other species. And the strategies we have es- 
tablished for population-genomic transcriptional profiling and 
analysis in N. crassa will be relevant for future work in any fungus 
for which sufficient isolates can be collected. 

MATERIALS AND METHODS 

Strains, sequencing, and gene expression quantification. The complete 
set of wild Louisiana isolates and engineered strains used in this study is 
listed in Table S4 in the supplemental material and was obtained from the 
Fungal Genetics Stock Center (FGSC) (60). Our analysis of Louisiana 
strain transcriptomes used 48 of the RNA-seq libraries from reference 19, 
which, for analysis on the same footing as data for other N. crassa popu- 
lations (see below), we remapped as follows. For each RNA-seq data set, 
we mapped Illumina reads to the Neurospora crassa OR74A version 10 
reference assembly (61) using the software program TopHat (62). We 
required reads to map uniquely with a maximum of two mismatches 
and minimum and maximum intron lengths of 40 and 200 bp, 
respectively, for mapping, coverage, and split-segment searches. We 
supplied TopHat with the Broad Institute's February 2010 release of 
the N. crassa version 4 annotation (http://www.broadinstitute.org 
/annotation/genome/neurospora/) and also allowed the program to 
search for novel splice junctions; we then integrated all putative isoforms 
into a single expression measure per gene. We used a custom Perl script 
and the N. crassa annotation to calculate the number of raw reads over- 
lapping each gene model and used the third-quartile method from refer- 
ence 63 to normalize read counts between lanes. For expression measure- 
ments from wild strains, we excluded from further analysis any gene for 
which >50% of the individuals analyzed had an expression level of zero. 
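
The normalization and filtering steps above can be illustrated with a short Python sketch. It assumes that the third-quartile method of reference 63 rescales each lane so that all lanes share a common upper quartile of nonzero read counts; the function names, and the choice of the mean upper quartile as the common target, are hypothetical and are not taken from the custom Perl script.

```python
from statistics import quantiles


def upper_quartile(counts):
    """Third quartile of the nonzero read counts in one lane (assumption:
    zero-count genes are left out of the quartile calculation)."""
    nonzero = [c for c in counts if c > 0]
    return quantiles(nonzero, n=4, method="inclusive")[2]


def normalize_libraries(counts_by_lib):
    """Rescale every lane so that all lanes share one upper quartile.

    counts_by_lib maps lane name -> list of raw read counts, with genes
    in the same order in every lane.
    """
    uq = {lib: upper_quartile(c) for lib, c in counts_by_lib.items()}
    target = sum(uq.values()) / len(uq)  # hypothetical common target
    return {
        lib: [c * target / uq[lib] for c in counts]
        for lib, counts in counts_by_lib.items()
    }


def keep_gene(expr_across_strains, max_zero_fraction=0.5):
    """Drop a gene when more than half of the strains show zero expression."""
    zeros = sum(1 for x in expr_across_strains if x == 0)
    return zeros / len(expr_across_strains) <= max_zero_fraction
```

A gene with zero expression in exactly half of the strains is retained under this reading, since the exclusion criterion is strictly greater than 50%.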

Expression clusters. To generate expression clusters from transcrip- 
tional profiles of wild Louisiana strains of N. crassa, we first normalized 
read counts for each gene in each RNA-seq library by gene length. We next 
used the Statistics::RankCorrelation Perl module to calculate Spearman's 
rank correlation coefficients between all pairwise comparisons of the 
8,361 N. crassa genes using expression levels from the 48 Louisiana indi- 
viduals. We used the absolute value of each correlation coefficient as input 
into the hclust package in the R software environment (64) and applied 
the method of reference 65 to define groups of coregulated genes. Briefly, 
given a correlation coefficient R, each cluster contains a set of genes 
among which any pair exhibit correlated expression across the strains with 
a coefficient of at least R. To establish a cutoff for R in Table S1 in the
supplemental material, we permuted expression measurements between
individuals and repeated the clustering analysis; then, given a cluster size s,
in this null set we tabulated the number n_permut of clusters containing at
least s genes. Across 10 such permutations, for R = 0.4 and s = 9, n_permut
was zero (see Table S1); clusters of size 9 or larger in analysis of the real
data are reported in Data Set S1.
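
The clustering rule and its permutation control can be sketched in Python as follows. This is a minimal illustration, not the hclust-based analysis itself: it assumes that the requirement that any pair of genes in a cluster be correlated at R or better corresponds to complete-linkage agglomeration on |rho|, and that the permutation shuffles each gene's measurements independently across individuals. All function names are hypothetical, and the naive implementation is only practical for small gene sets.

```python
import random
from itertools import combinations


def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def spearman(x, y):
    """Spearman's rank correlation (Pearson correlation of the ranks)."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0


def cluster_genes(expr, r_cutoff):
    """Complete-linkage clustering on |rho|; expr maps gene -> values per strain.

    Two clusters merge only if every cross-cluster pair has |rho| >= r_cutoff.
    """
    genes = list(expr)
    sim = {
        frozenset(p): abs(spearman(expr[p[0]], expr[p[1]]))
        for p in combinations(genes, 2)
    }
    clusters = [[g] for g in genes]

    def linkage(a, b):
        return min(sim[frozenset((g, h))] for g in a for h in b)

    while len(clusters) > 1:
        best, i, j = max(
            (linkage(a, b), i, j)
            for (i, a), (j, b) in combinations(enumerate(clusters), 2)
        )
        if best < r_cutoff:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def n_permut(expr, r_cutoff, s, seed=0):
    """Number of clusters with at least s genes after one permutation."""
    rng = random.Random(seed)
    shuffled = {g: rng.sample(v, len(v)) for g, v in expr.items()}
    return sum(1 for c in cluster_genes(shuffled, r_cutoff) if len(c) >= s)
```

Under these assumptions, a cutoff would be accepted when `n_permut` returns zero for the chosen R and s across repeated permutations with different seeds.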

Functional enrichment tests in clusters. We tested each cluster of 
genes subject to coregulation across Louisiana strains for enrichment of 
gene functions as follows. We downloaded N. crassa gene annotations 
from the FunCat database maintained by the Munich Information Center 
for Protein Sequences (20). We eliminated from analysis all clusters that 
contained <9 genes and those with <2 genes with assigned FunCat an- 
notations; the final set comprised 143 clusters. We used a custom Perl 
script and the R function phyper to assess enrichment of each FunCat 
annotation term in each cluster. We used the Benjamini-Hochberg (66) 
correction for multiple hypothesis testing within each of the five FunCat 
levels; see Data Set S1 in the supplemental material for lists of all FunCat
terms and all clusters exhibiting enrichment at a corrected P value of 
<0.05. 
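
The enrichment calculation can be sketched in Python as an upper-tail hypergeometric test followed by Benjamini-Hochberg adjustment. The sketch assumes a one-sided test of over-representation, equivalent to calling the R function phyper with the observed count minus one and lower.tail = FALSE; the function names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for a cluster of n genes drawn from N annotated genes,
    K of which carry the FunCat term of interest."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for step, i in enumerate(order):
        rank = m - step  # rank of this P value in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In the design described above, the adjustment would be applied separately to the P values from each of the five FunCat levels.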

NCU04826 conservation. Homologs of NCU04826 were identified
using the Joint Genome Institute MycoCosm fungal genomics web
resource (67). Multiple sequence alignment was performed using the
software program MUSCLE (68).

Ellison et al., March/April 2014, Volume 5, Issue 2, e01046-13, mbio.asm.org

Hyphal morphology assays. The deletion strain for NCU04826 
(FGSC 16805) (6) (mating type a) was crossed to an isogenic wild-type 
control (mating type A; FGSC 2489) on Westergaard synthetic crossing 
medium (69), and progeny were screened for hygromycin resistance on 
sorbose medium (70) with and without 200 µg/ml hygromycin (Sigma).
Five hygromycin-resistant and five hygromycin-sensitive progeny were 
inoculated onto petri dishes containing Vogel's minimal medium (VMM) 
(71) and allowed to grow overnight. Strains were observed under a dis- 
secting microscope, and hyphal branching was scored as follows. For each 
hypha, the number of nodes n_n was counted, starting from the tip and
moving toward the center of the colony, until the first subtending branch
that was also branching was reached. Values of n_n were averaged across 20
randomly selected hyphal tips for each individual. 

For imaging of colonies of strains harboring deletions in genes
coregulated with NCU04826 in Fig. S1 in the supplemental material, each
deletion strain in the FGSC 2489 background (6) was obtained from the
Fungal Genetics Stock Center and grown on VMM overnight.

Generation of Nc_nmr6 deletion strain. The gene knockout
procedure for NCU00789 was modified from a previously published method
(6) as follows. The strain FGSC 9717 (mus-51::bar a) was grown for 3 days 
at 30°C in the dark, followed by another 7 days at room temperature, in 
100 ml VMM containing 2% (wt/vol) sucrose and 1.5% (wt/vol) agar. 
Conidia were collected by filtration, washed three times with ice-cold 1 M 
sorbitol, and resuspended in a final volume of 5 ml. An aliquot of 90 µl
(~10^9 conidia) was mixed with 1.0 µg of DNA encoding an NCU00789
deletion cassette (the sequence of the hph hygromycin resistance gene [72]
flanked by ~1 kb of the genomic region upstream of NCU00789 in the
N. crassa reference genome and ~1 kb of the region downstream of the
gene, kindly provided by Carol Ringelberg, Department of Genetics,
Dartmouth Medical School). DNA entry into conidia was via electroporation
on a Bio-Rad Gene Pulser II with the following settings: 1.5 kV, 600 Ω, and
25 µF. Treated conidia were then mixed with 900 µl ice-cold 1 M sorbitol
and 30 ml of a top agar solution containing 20% sucrose, 0.5% fructose,
and 0.5% glucose (FGS) warmed to 50°C. This mixture was overlaid on
each of three agar plates containing FGS medium and 200 µg/ml
hygromycin and subsequently incubated at 30°C for 3 days. Individual colonies
were used to inoculate agar slants (3 ml VMM containing 2% sucrose and
200 µg/ml hygromycin), from which DNA was isolated and screened by
PCR with the primers 5' TGCAATAGGTCAGGCTCT 3' (hyg) and
5' GCGGATAACAATTTCACACAG 3' (NCU00789) using Phire Hot Start
polymerase according to the manufacturer's recommendations (Thermo
Scientific; catalogue no. F-130).

To reduce the likelihood of background mutations in NCU00789 de- 
letion strains, each PCR-confirmed mutant was backcrossed to a wild type 
as follows. FGSC 2489 was grown as a female strain on Westergaard me- 
dium (69) for 7 days to allow formation of protoperithecia and crossed 
with conidia from each deletion strain of the opposite mating type. Asco- 
spores for each cross were collected, placed in sterile water, and induced to 
germinate via incubation at 60°C for 30 min. Treated ascospores were 
spread on VMM-2% sucrose plates containing 400 µg/ml hygromycin
and grown at 30°C for 14 h. Germinated spores were isolated and used to
inoculate 3-ml VMM-2% sucrose slants. Cultures were checked for loss of
the bar gene by lack of growth on VMM without NH4NO3, 0.5%
L-proline, 2% sucrose, and 400 µg/ml ignite. Progeny from each cross
were also checked for integration of the hygromycin cassette by PCR as 
described above. 

Transcriptional profiling of N. crassa deletion strains. Following 
methods described elsewhere (73), we cultured, harvested, and isolated 
RNA from one replicate of the pp4 deletion strain (FGSC 12454) (6), one 
of the NCU05257 deletion strain (FGSC 16020) (6), and one of OR74A 
(FGSC 4200) as a wild-type control, and separately, we isolated RNA from
one replicate of the NCU00789 deletion strain and an OR74A wild-type 
control. Multiplex library construction from each isolate's cDNA was
done according to the manufacturer's protocol (TruSeq v2 LT sample
prep kit; Illumina) except that adaptors were diluted 5-fold before ligation to
blunt-end cDNA. The RNA-seq libraries were sequenced on a HiSeq 2500
instrument (Illumina) as single-end 50-bp reads.

In each RNA-seq data set, we eliminated from analysis all genes with 
<5 mapped reads. We then assessed the significance of differential expres- 
sion of each gene between each deletion strain and its respective control 
using the R software package DEseq (74), normalizing read counts be- 
tween lanes in each comparison using the estimateSizeFactors function 
and using expression variance across the pooled samples as described 
previously (74). Genome-wide expression measurements for each dele- 
tion strain are given in Data Set S2 in the supplemental material. 
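
The between-lane normalization performed by estimateSizeFactors can be illustrated as a median-of-ratios calculation. The Python below is a reimplementation sketch under the assumption that each library's size factor is the median, over genes with nonzero counts in every library, of the ratio of its count to the gene's geometric mean; it is not the DEseq code, the median is taken on the log scale for convenience, and all names are hypothetical.

```python
from math import exp, log
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts maps library name -> list of raw read counts, with genes in the
    same order in every library.
    """
    libraries = list(counts)
    n_genes = len(counts[libraries[0]])
    # Log geometric mean per gene; genes with any zero count are skipped.
    log_geo_mean = {}
    for g in range(n_genes):
        row = [counts[lib][g] for lib in libraries]
        if all(c > 0 for c in row):
            log_geo_mean[g] = sum(log(c) for c in row) / len(row)
    factors = {}
    for lib in libraries:
        log_ratios = [log(counts[lib][g]) - m for g, m in log_geo_mean.items()]
        factors[lib] = exp(median(log_ratios))
    return factors


def normalize(counts):
    """Divide each library's counts by its size factor."""
    sf = size_factors(counts)
    return {lib: [c / sf[lib] for c in vals] for lib, vals in counts.items()}
```

A library sequenced to twice the depth of its comparison partner receives a size factor twice as large, so the normalized counts of unchanged genes agree between the deletion strain and its control.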

We used a resampling approach to evaluate the expression response of 
a regulon of interest in the NCU05257 (asi-1), NCU08301 (pp4), or 
NCU00789 (Nc_nmr6) deletion strain transcriptional profile. For this 
purpose, we first tabulated the average of the log10 fold change in
expression between the wild type and the deletion strain across the genes of the
regulon. We then applied the same procedure to a gene set of the same size 
as the true regulon, randomly drawn from the pool of expressed genes 
(excluding those present in the regulon). The latter resampling procedure 
was repeated 10,000 times. Significance, in a one-tailed test of the hypoth- 
esis that the true regulon was expressed at lower levels than the null ex- 
pectation, was assessed as the proportion of resampled gene sets whose 
average expression ratios were less than or equal to that of the true regu- 
lon. 
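
The resampling procedure can be sketched in Python as follows. The sketch assumes that fold changes are oriented so that negative values indicate lower expression in the deletion strain than in the wild type; the function name, argument names, and random seed are hypothetical.

```python
import random
from statistics import mean


def resampling_p(log_fc, regulon, n_resamples=10_000, seed=0):
    """One-tailed resampling P value for low expression of a regulon.

    log_fc maps every expressed gene -> log fold change (deletion strain
    relative to wild type); regulon is the collection of genes of interest.
    """
    rng = random.Random(seed)
    observed = mean(log_fc[g] for g in regulon)
    # Null sets are drawn from expressed genes outside the regulon.
    background = [g for g in log_fc if g not in regulon]
    hits = 0
    for _ in range(n_resamples):
        sample = rng.sample(background, len(regulon))
        if mean(log_fc[g] for g in sample) <= observed:
            hits += 1
    return hits / n_resamples
```

With 10,000 resamples, the smallest nonzero P value that this scheme can report is 0.0001.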

Amino acid starvation assays. The deletion strain for NCU05257 
(FGSC 16020) was crossed to an isogenic wild-type control (FGSC 4200), 
and five progeny strains bearing the deletion and five wild-type progeny 
were recovered, as described above. Agar plugs were taken from each 
progeny culture and used to inoculate race tubes (75) containing VMM 
(71) with and without 6 mM 3-AT (Sigma). Each tube was incubated in 
constant light at 25°C, and the location of the hyphal front was recorded 
daily until it reached the opposite end of the tube. For comparative pur- 
poses, we also measured the growth rate for three replicates of a cpc-1 
mutant strain (FGSC 4264) (76) on race tubes containing VMM with and 
without 6 mM 3-AT. 

Differential expression between N. crassa populations. We calcu- 
lated normalized expression measurements for each of 19 Caribbean iso- 
lates and 3 Panamanian isolates of N. crassa as described above, using 
previously published RNA-seq data (11). Given these measures and the 
expression profiles of Louisiana isolates as described above, we tested each 
gene in turn for differential expression between the Caribbean and Loui- 
siana populations using the Wilcoxon test; we then corrected these nom- 
inal empirical P values using the Benjamini-Hochberg method (66) for 
multiple hypothesis testing. Using the set of 1,539 genes with differential 
expression significant at a Benjamini-Hochberg-corrected P value of 
<0.05, we evaluated enrichment of FunCat annotations as above. 
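
The per-gene comparison between populations can be illustrated with a Python sketch of the two-sided Wilcoxon rank-sum test. It uses the normal approximation with a tie correction and no continuity correction, which is an assumption made for brevity; statistical packages may compute exact P values for small samples, so results can differ slightly. The function name is hypothetical.

```python
from math import erfc, sqrt


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum P value (normal approximation).

    x and y hold the normalized expression of one gene in the two populations.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted((v, idx) for idx, v in enumerate(list(x) + list(y)))
    rank = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        t = j - i + 1  # size of this group of tied values
        tie_term += t ** 3 - t
        for k in range(i, j + 1):
            rank[pooled[k][1]] = (i + j) / 2 + 1
        i = j + 1
    u = sum(rank[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return 1.0  # all values identical; no evidence of a difference
    z = (u - mu) / sqrt(var)
    return erfc(abs(z) / sqrt(2))
```

The nominal P values from all genes would then be passed through the Benjamini-Hochberg correction before the 0.05 threshold is applied.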

Curation of nitrogen metabolism gene sets in N. crassa. To define 
the nitrogen metabolite repression program in N. crassa in Data Set S4 in 
the supplemental material, we used the NMR (nitrogen metabolite
repression) signature from Saccharomyces cerevisiae (77) to identify 59
homologs in the reference sequence of N. crassa by a best-reciprocal-BLAST
search (protein BLAST with E value cutoff of 1e-04). We also included
NMR-controlled genes characterized for Neurospora (78-81) for a total of 
66 genes. 
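
The best-reciprocal-BLAST criterion can be sketched in Python as follows. The sketch assumes that the protein BLAST results for each search direction have already been parsed into (query, subject, E value) tuples; the function names and the gene identifiers in the usage example are hypothetical.

```python
def best_hits(hits, max_evalue=1e-4):
    """Map each query to its lowest-E-value subject passing the cutoff.

    hits is an iterable of (query, subject, evalue) tuples.
    """
    best = {}
    for query, subject, evalue in hits:
        if evalue <= max_evalue and (query not in best or evalue < best[query][1]):
            best[query] = (subject, evalue)
    return {q: s for q, (s, _) in best.items()}


def reciprocal_best_hits(forward, reverse, max_evalue=1e-4):
    """Pairs in which each protein is the other's best hit."""
    fwd = best_hits(forward, max_evalue)
    rev = best_hits(reverse, max_evalue)
    return {q: s for q, s in fwd.items() if rev.get(s) == q}


# Usage with made-up identifiers: yeast queries against N. crassa, and back.
forward = [("yeastA", "ncu1", 1e-50), ("yeastA", "ncu2", 1e-20), ("yeastB", "ncu2", 1e-30)]
reverse = [("ncu1", "yeastA", 1e-48), ("ncu2", "yeastA", 1e-25)]
pairs = reciprocal_best_hits(forward, reverse)
```

In this example only the yeastA and ncu1 pair is retained, because the best hit of ncu2 is yeastA rather than yeastB.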

To define a broader set of nitrogen metabolism genes in N. crassa, we 
downloaded expression measurements of Magnaporthe grisea under 
nitrogen-starved and nitrogen-replete conditions from reference 33 and 
retained for analysis all genes for which the absolute value of the log2
signal intensity ratio between the conditions was 2-fold or greater. We 
then used the FungiDB database (82) to identify N. crassa orthologs of 
these genes (see Data Set S4 in the supplemental material). We also added 
genes annotated as involved in nitrogen metabolism in N. crassa using GO 
(gene ontology) terms, FunCat terms, and the Broad Institute annotation 
(http://www.broadinstitute.org/annotation/genome/neurospora/), for a
total of 1,203 genes (see Data Set S4). 

As our set of candidate regulators of NMR in Data Set S4, we compiled 
a list of S. cerevisiae regulators from reference 29 and examined all Neu- 
rospora homologs of these genes using homology search criteria so as to 
include both orthologs and paralogs (protein BLAST without best-
reciprocal-BLAST requirement; E value cutoff of 1e-10).

Carbon catabolite repressed genes in N. crassa. Our list of carbon 
catabolite repressed genes was derived from transcriptional profiling of
the cre-1 deletion strain performed previously (32). Specifically, we used 
the 75 genes showing expression levels that were significantly elevated in 
the cre-1 deletion strain compared to wild-type levels when grown on 
minimal medium (see Table S2 in reference 32). 

Association of gene expression with genotype. To test expression of a 
given nitrogen metabolism gene for association with inheritance at the 
DNA level across Louisiana strains, given the normalized expression mea- 
surements for a transcript and genotypes at a single-nucleotide variant, we 
first split the strains into two groups on the basis of inheritance at the 
variant and then used the Wilcoxon test with Benjamini-Hochberg cor- 
rection as described above. 